[superlog] Downgrade recoverable AI tool errors from ERROR to WARN in job-level wide event by superlog-app[bot] · Pull Request #483 · databuddy-analytics/Databuddy

superlog-app · 2026-06-16T09:30:35Z

Summary

Insights generation jobs that encounter a recoverable AI tool failure (e.g. a ClickHouse SQL query error, an annotation creation returning NOT_FOUND) emit their final job-completion log at ERROR severity even when job_status: "succeeded". This creates false-positive Superlog incidents on every scheduled batch run.

Root cause

createToolLogger.error() in packages/ai/src/ai/tools/utils/logger.ts calls requestLogger.error(err, …), which sets evlog's internal hasError = true on the job-scoped wide event. When the job runner calls logger.emit({ job_status: "succeeded" }) at the end of a successful run, evlog's emit() fires at ERROR level because hasError is still set — even though the AI agent recovered from the tool failure and the job completed successfully.

Fix

Change the request-scoped path in createToolLogger.error() from requestLogger.error() to requestLogger.warn(). All error context (tool name, SQL snippet, error message) is still merged into the wide event at WARN severity. Genuine job failures continue to log at ERROR — apps/insights/src/jobs.ts calls logger.error(err) directly on the job-failure path, which is unaffected by this change. The log.error(…) fallback (used outside a job scope) is also unchanged.

Alternative approach: move the level-escalation guard into the job runner itself (only emit at ERROR when job_status: "failed"), but this would require changes to jobs.ts for every call site and doesn't fix the same pattern if it appears in other services using createToolLogger.

Incident on Superlog

Was this PR helpful? Leave feedback — goes straight to the Superlog team.

Summary by cubic

Downgraded recoverable tool errors from ERROR to WARN in job-scoped events to prevent false-positive incidents when a job ultimately succeeds. This keeps successful insights runs from emitting a final ERROR log.

Bug Fixes
- In packages/ai/src/ai/tools/utils/logger.ts, updated createToolLogger.error() to use requestLogger.warn(message, …) instead of requestLogger.error(err, …) so recoverable tool failures don’t escalate the wide event; real job failures still log at ERROR via the job runner.

^{Written for commit 0487ce1. Summary will update on new commits.}

… job-level wide event

vercel · 2026-06-16T09:30:39Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
databuddy-status	Ready	Preview, Comment	Jun 16, 2026 9:31am

2 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
dashboard	Skipped		Jun 16, 2026 9:31am
documentation	Skipped		Jun 16, 2026 9:31am

unkey-deploy · 2026-06-16T09:30:59Z

The latest updates on your projects. Learn more about Unkey Deploy

Name	Status	Preview	Inspect	Updated (UTC)
api (preview)	Ready	Visit Preview	Inspect	Jun 16, 2026 9:31am

blacksmith-sh · 2026-06-16T09:32:40Z

Found 7 test failures on Blacksmith runners:

Failures

Test	View Logs
`[chromium] › test/e2e/specs/regressions/` `api-keys.spec.ts:4:5 › creates and deletes an API key without leaving confirmation dial` `ogs open @regression @core`	View Logs
`[chromium] › test/e2e/specs/regressions/` `website-analytics.spec.ts:31:5 › shows seeded analytics data and applies a topbar filte` `r @regression @core`	View Logs
`[chromium] › test/e2e/specs/regressions/` `website-analytics.spec.ts:7:5 › renders analytics controls in the website topbar @regre` `ssion @core`	View Logs
`[chromium] › test/e2e/specs/smoke/` `account.spec.ts:3:5 › updates the signed-in user's profile name @smoke`	View Logs
`[chromium] › test/e2e/specs/smoke/` `auth.spec.ts:12:5 › boots an authenticated browser session @smoke`	View Logs
`[chromium] › test/e2e/specs/smoke/` `auth.spec.ts:30:5 › signs out and protects authenticated routes @smoke @core`	View Logs
`[chromium] › test/e2e/specs/smoke/` `auth.spec.ts:3:5 › redirects unauthenticated visitors to sign in @smoke`	View Logs

greptile-apps · 2026-06-16T09:33:45Z

Greptile Summary

This PR fixes false-positive Superlog incidents caused by recoverable AI tool failures (e.g. a failed ClickHouse SQL query or a NOT_FOUND annotation response) escalating the job-level wide event to ERROR even when the job ultimately succeeded. The fix is a one-line change in createToolLogger.error(): the request-scoped call is switched from requestLogger.error() to requestLogger.warn(), preventing evlog from setting hasError = true on the wide event for these recoverable errors.

The root cause and fix are accurately diagnosed; the genuine job-failure path in jobs.ts (logger.error(err) → logger.emit({ job_status: \"failed\" })) is untouched and continues to emit at ERROR.
The change also silently removes new Error(message) (and its stack trace) in favour of passing the raw string to requestLogger.warn().
All 30+ logger.error() call sites across tool files are now uniformly downgraded to WARN within a request scope, with no remaining way for a caller to force ERROR level through createToolLogger.

Confidence Score: 4/5

Safe to merge for the stated goal; the job-failure path is untouched and genuine incidents will still fire at ERROR.

The change is small, well-reasoned, and correctly targets the evlog escalation mechanism. The main open question is whether blanket-downgrading every tool logger.error() call to WARN is the right long-term interface contract — currently there is no way for any tool to log at ERROR within a request scope, even for errors that may warrant it. This is a design trade-off worth discussing, not a blocking defect.

The single changed file packages/ai/src/ai/tools/utils/logger.ts and the 30+ tool files that call logger.error() — none of those callers currently have a way to override the WARN downgrade if a future error warrants ERROR-level visibility.

Important Files Changed

Filename	Overview
packages/ai/src/ai/tools/utils/logger.ts	Downgrades `requestLogger.error()` to `requestLogger.warn()` in the request-scoped path of `createToolLogger.error()`; also removes the `new Error(message)` wrapper, dropping stack trace capture. The core fix is sound but every `logger.error()` call site across 30+ tool usages is now silently at WARN with no way to escalate.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Tool as AI Tool (e.g. annotations.ts)
    participant TL as createToolLogger
    participant RL as requestLogger (evlog wide event)
    participant JR as jobs.ts (job runner)

    Note over JR,RL: Job starts — wide event created
    JR->>RL: createInsightsEventLog(context)

    Tool->>TL: logger.error("Failed to create annotation", ctx)

    alt BEFORE this PR
        TL->>RL: "requestLogger.error(new Error(msg), {aiTool, ...ctx})"
        Note over RL: hasError = true — wide event escalated to ERROR
        JR->>RL: "logger.emit({ job_status: "succeeded" })"
        Note over RL: emits at ERROR — false-positive incident
    else AFTER this PR
        TL->>RL: "requestLogger.warn(message, {aiTool, ...ctx})"
        Note over RL: hasError unchanged — wide event stays at INFO/WARN
        JR->>RL: "logger.emit({ job_status: "succeeded" })"
        Note over RL: emits at correct level — no false-positive
    end

    Note over JR,RL: Genuine failure path (unchanged)
    JR->>RL: logger.error(err)
    JR->>RL: "logger.emit({ job_status: "failed" })"
    Note over RL: emits at ERROR — real incident fires correctly

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant Tool as AI Tool (e.g. annotations.ts)
    participant TL as createToolLogger
    participant RL as requestLogger (evlog wide event)
    participant JR as jobs.ts (job runner)

    Note over JR,RL: Job starts — wide event created
    JR->>RL: createInsightsEventLog(context)

    Tool->>TL: logger.error("Failed to create annotation", ctx)

    alt BEFORE this PR
        TL->>RL: "requestLogger.error(new Error(msg), {aiTool, ...ctx})"
        Note over RL: hasError = true — wide event escalated to ERROR
        JR->>RL: "logger.emit({ job_status: "succeeded" })"
        Note over RL: emits at ERROR — false-positive incident
    else AFTER this PR
        TL->>RL: "requestLogger.warn(message, {aiTool, ...ctx})"
        Note over RL: hasError unchanged — wide event stays at INFO/WARN
        JR->>RL: "logger.emit({ job_status: "succeeded" })"
        Note over RL: emits at correct level — no false-positive
    end

    Note over JR,RL: Genuine failure path (unchanged)
    JR->>RL: logger.error(err)
    JR->>RL: "logger.emit({ job_status: "failed" })"
    Note over RL: emits at ERROR — real incident fires correctly

Comments Outside Diff (2)

packages/ai/src/ai/tools/utils/logger.ts, line 21-41 (link)

error() method permanently downgrades all callers to WARN — no escape hatch for genuine tool-level errors

Every call site across the tools (annotations, flags, funnels, goals, links, query, rpc — 30+ logger.error() call sites) is now silently logged at WARN in any request-scoped context, with no way for a caller to force ERROR level through createToolLogger. If a future tool encounters an error that is not recoverable but still gets caught-and-returned-as-a-tool-response (so the job itself succeeds), engineers will only see a WARN in the wide event with no alerting surface. The existing warn() method already exists on the returned object; callers that genuinely need WARN could have used it directly, leaving error() at ERROR for the cases that warrant it.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
packages/ai/src/ai/tools/utils/logger.ts, line 21-33 (link)

Stack trace dropped when switching from Error object to plain string

The old code created new Error(message) and passed it to requestLogger.error(err, …), which lets evlog capture a stack trace pinning the exact tool call site. The new call passes message (a plain string) to requestLogger.warn(), so the stack is no longer attached. At WARN severity this is less critical, but it does make it harder to identify exactly which code path produced a logged tool failure during an investigation.

_{Reviews (1): Last reviewed commit: "[superlog] Downgrade recoverable AI tool..." | Re-trigger Greptile}

[superlog] Downgrade recoverable AI tool errors from ERROR to WARN in…

0487ce1

… job-level wide event

vercel Bot temporarily deployed to Preview – documentation June 16, 2026 09:30 Inactive

vercel Bot temporarily deployed to Preview – dashboard June 16, 2026 09:30 Inactive

unkey-deploy Bot deployed to links - preview June 16, 2026 09:30 View deployment

vercel Bot deployed to Preview – databuddy-status June 16, 2026 09:31 View deployment

unkey-deploy Bot deployed to api - preview June 16, 2026 09:31 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[superlog] Downgrade recoverable AI tool errors from ERROR to WARN in job-level wide event#483

[superlog] Downgrade recoverable AI tool errors from ERROR to WARN in job-level wide event#483
superlog-app[bot] wants to merge 1 commit into
stagingfrom
superlog/fix-tool-logger-error-escalation

superlog-app Bot commented Jun 16, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

vercel Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

unkey-deploy Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

blacksmith-sh Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 16, 2026 •

edited

Loading

Comments Outside Diff (2)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Uh oh!

Conversation

superlog-app Bot commented Jun 16, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause

Fix

Summary by cubic

Uh oh!

vercel Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

unkey-deploy Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

blacksmith-sh Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Failures

Uh oh!

greptile-apps Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (2)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

superlog-app Bot commented Jun 16, 2026 •

edited by cubic-dev-ai Bot

Loading

vercel Bot commented Jun 16, 2026 •

edited

Loading

unkey-deploy Bot commented Jun 16, 2026 •

edited

Loading

blacksmith-sh Bot commented Jun 16, 2026 •

edited

Loading

greptile-apps Bot commented Jun 16, 2026 •

edited

Loading